11-928 Master’s Thesis: Symmetric Probabilistic Alignment

Author

  • Jae Dong Kim
Abstract

The CMU Example-Based Machine Translation (EBMT) system has been deployed successfully in many projects for years. A good alignment algorithm is essential because the CMU EBMT system relies on parallel corpora, yet alignment has been studied relatively less than other components of EBMT. For this reason, we developed a new alignment algorithm that uses statistical information drawn from parallel corpora and heuristics based on human linguistic knowledge. Unlike most alignment approaches in Statistical Machine Translation (SMT) systems, our alignment algorithm uses only bilingual dictionaries, trained by other systems, as statistical information; calculates alignment scores bi-directionally; and aims at aligning source fragments up to 8 words long. In our experiments so far, it outperformed the old heuristic-based alignment algorithm in both alignment accuracy and translation accuracy in EBMT. Its performance was very close to the state of the art in SMT systems, for which we picked IBM Model 4 for comparison, and a combination of our new method and IBM Model 4 performed best.
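
The abstract names three ingredients: dictionary probabilities, scores computed in both directions, and source fragments of at most 8 words. The Python sketch below shows one way those pieces could fit together. It is not the thesis's actual algorithm: the scoring formula (average log-probability of each word's best match, summed over both directions), the smoothing floor, the exhaustive span search, and all function names are assumptions made for illustration.

```python
import math

MAX_FRAGMENT_LEN = 8  # limit on source fragment length stated in the abstract
FLOOR = 1e-6          # assumed smoothing value for word pairs missing from a dictionary


def directional_score(from_words, to_words, dictionary):
    """Average log-probability of each word in from_words, using its
    best-matching word in to_words (hypothetical scoring rule)."""
    total = 0.0
    for w in from_words:
        best = max(dictionary.get((w, v), FLOOR) for v in to_words)
        total += math.log(best)
    return total / len(from_words)


def symmetric_score(src_frag, tgt_frag, src_to_tgt, tgt_to_src):
    """Add the two directional scores, so a good alignment must be
    supported by both dictionaries (assumed combination rule)."""
    return (directional_score(src_frag, tgt_frag, src_to_tgt)
            + directional_score(tgt_frag, src_frag, tgt_to_src))


def best_target_span(source, target, start, end, src_to_tgt, tgt_to_src):
    """Return (score, (k, l)) where target[k:l] is the contiguous target
    span with the highest symmetric score for source[start:end]."""
    src_frag = source[start:end]
    if not 1 <= len(src_frag) <= MAX_FRAGMENT_LEN:
        raise ValueError("source fragment must be 1 to 8 words long")
    best = None
    # Exhaustive search over all contiguous target spans (illustrative only).
    for k in range(len(target)):
        for l in range(k + 1, len(target) + 1):
            score = symmetric_score(src_frag, target[k:l], src_to_tgt, tgt_to_src)
            if best is None or score > best[0]:
                best = (score, (k, l))
    return best
```

Here `src_to_tgt` is keyed by `(source_word, target_word)` and `tgt_to_src` by `(target_word, source_word)`. Scoring in both directions penalizes target spans that are too short (source words left unexplained) as well as spans that are too long (extra target words left unexplained), which a one-directional score would not do.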


Similar resources

Development of Polygon Reduction Algorithms for Symmetric 3D Models

This Master’s thesis describes two polygon reduction algorithms suitable for symmetric 3D models. Also, a continuous symmetry measure is developed which makes it possible to, on a continuous scale, quantify the amount of symmetry an object possesses. Typically, a polygon reduction algorithm takes a 3D model as input and generat...


Attack-tree based risk analysis of Estonian i-voting

This report analyzes two independent works published in 2014 that model security threats of the Estonian i-voting scheme using attack trees. The first one, the master’s thesis of Tanel Torn [11], constructs several realistic attack trees for various types of attacks on the Estonian i-voting system and evaluates them using three different state-of-the-art methodologies proposed in attack-tree literature....


Master’s Thesis Research Proposal

This is a proposal for the research I wish to do for my Master’s thesis. It is an attempt to categorize what I know, what I don’t know, what I need to do, and where I need help. It also consists of my attempt to completely survey the literature.


Symmetric Probabilistic Alignment for Example-Based Translation

Since subsentential alignment is critically important to the translation quality of an Example-Based Machine Translation (EBMT) system which operates by finding and combining phrase-level matches against the training examples, we recently decided to develop a new alignment algorithm for the purpose of improving the EBMT system’s performance. Unlike most algorithms in the literature, this new Sy...


Microwave Tomography for Breast Cancer Detection Master’s thesis in Master’s of Biomedical Engineering




Publication date: 2006